Bug 1454743 - foreman-proxy memory leak when processing OpenScap report
Summary: foreman-proxy memory leak when processing OpenScap report
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: SCAP Plugin
Version: 6.2.10
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: Unspecified
Assignee: Ondřej Pražák
QA Contact: Sanket Jagtap
URL:
Whiteboard:
Duplicates: 1487257
Depends On:
Blocks:
 
Reported: 2017-05-23 12:13 UTC by Pavel Moravec
Modified: 2022-07-09 09:21 UTC (History)
CC List: 19 users

Fixed In Version: smart_proxy_openscap-0.5.3.9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1530700
Environment:
Last Closed: 2018-02-05 13:54:34 UTC
Target Upstream Version:
Embargoed:


Attachments
objects in memory (151.92 KB, image/png), 2017-06-02 08:36 UTC, Ondřej Pražák
diagnostic & cleanup-attempt patch for foreman-proxy (5.28 KB, patch), 2017-06-04 15:15 UTC, Pavel Moravec
task-exports from failed hotfix system (10.76 MB, application/x-xz), 2017-10-04 10:26 UTC, maneesh verma


Links
- Foreman Issue Tracker 20164 (Normal, Closed): smart_proxy_openscap leaks memory (last updated 2020-02-24 09:12:14 UTC)
- Red Hat Product Errata RHSA-2018:0273 (normal, SHIPPED_LIVE): Important: Red Hat Satellite 6 security, bug fix, and enhancement update (last updated 2018-02-08 00:35:29 UTC)

Description Pavel Moravec 2017-05-23 12:13:54 UTC
Description of problem:
Even with the hotfix from bz1432263, it is assumed that when a SCAP report contains a nonzero "othered" counter, processing such a report triggers a memory leak in the smart-proxy process / foreman-proxy.

No particular reproducer known ATM, but the above statement follows from:
- comparison of "ps aux" snapshots and foreman-ssl_access_ssl.log activities at the times when smart-proxy RSS usage grew
- ruling out proxy-unrelated events
- ruling out events that were present many times when no memory growth was noticed
- only arf_report remained as the sole suspect
- identifying which particular SCAP report parameter is present in the "suspicious" reports but not in the other reports (where no memory increase was seen)

I can provide the "ps aux" snapshots, together with the whole /var/log content, if required.


Version-Release number of selected component (if applicable):
rubygem-smart_proxy_openscap-0.5.3.6-2.RHBZ1432263.el7sat.noarch.rpm


How reproducible:
???


Steps to Reproduce:
1. I guess it should be enough to generate a report with a nonzero "othered" counter.
2. Meanwhile, monitor the RSS usage of the smart-proxy process.


Actual results:
RSS grows with each and every such report processed.


Expected results:
No RSS growth.


Additional info:
RSS growth is very evident - 400 MB to 600 MB per report (sic!).

Comment 1 Pavel Moravec 2017-05-24 08:38:55 UTC
I can't reproduce it by simply sending a report with the "othered" counter > 0.

Checking what has been fixed in bz1432263, and being a ruby newbie, shouldn't the same cleanup be applied to:

https://github.com/theforeman/smart_proxy_openscap/blob/master/lib/smart_proxy_openscap/storage_fs.rb#L31

?

If some of the "Proxy::OpenSCAP::StorageFS.new(..)" calls in openscap_api.rb raise an exception, shouldn't we clean up OpenSCAP?
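
To illustrate what I mean - a minimal sketch (the wrapper name is hypothetical, not actual plugin code) that would guarantee the cleanup even when an exception is raised:

def with_openscap
  OpenSCAP.oscap_init                # set up libopenscap / libxml2 global state
  yield
ensure
  OpenSCAP.oscap_cleanup             # runs even if StorageFS.new or report parsing raised
end

# usage sketch:
#   with_openscap { Proxy::OpenSCAP::StorageFS.new(..) }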

Comment 3 Bryan Kearney 2017-05-30 17:52:18 UTC
Pavel, can you reproduce it or not (reading comment 1)?

Comment 6 Pavel Moravec 2017-05-30 20:17:15 UTC
See the HUGE length of one arf report (each "Parameters: .." line has 270k - 290k characters (!!)) - that brings two ideas:

- can OpenSCAP properly handle such huge reports? (well, most often yes, and the "problematic" ones aren't the biggest)
- is it necessary for the client to send the descriptions of each and every scap log? the server already has those details.. - this could be hugely improved/simplified with a positive impact on memory

Comment 7 Dmitri Dolguikh 2017-05-31 11:00:25 UTC
I think Pavel is correct that calls to OpenSCAP.oscap_init should be followed by calls to OpenSCAP.oscap_cleanup, which would deallocate memory and clean up global state.

Another thing that looks like a problem to me is that OpenSCAP.oscap_init makes a call to xmlInitParser(), which, according to the libxml2 docs, is not reentrant. Smart-proxy is multi-threaded, and concurrent calls handled by the OpenSCAP module can lead to memory leaks or undefined behavior. To correct this, the OpenSCAP module initialization should be changed so that oscap_init is called once during module initialization and oscap_cleanup is called during shutdown.
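
A minimal sketch of that initialize-once pattern (an illustration only, not the actual smart_proxy_openscap module code):

OpenSCAP.oscap_init      # calls xmlInitParser() exactly once, at module load time

at_exit do
  OpenSCAP.oscap_cleanup # release the global libopenscap / libxml2 state at shutdown
end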

Comment 8 Pavel Moravec 2017-06-01 20:40:03 UTC
Maybe a stupid question, but in

https://github.com/theforeman/smart_proxy_openscap/pull/47/files#diff-fe7a79abfb039a8b67550ad94edcfd93

shouldn't the new cleanup method also call

@items.destroy if @items

analogously to the other variables declared/used there?
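
In other words, roughly this shape for the cleanup method (a sketch only, with the other destroy calls elided; not the actual PR code):

def cleanup
  # ... existing destroy calls for the other wrapped objects ...
  @items.destroy if @items   # the extra line being asked about
end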

Comment 9 Ondřej Pražák 2017-06-02 08:36:09 UTC
Created attachment 1284363 [details]
objects in memory

We definitely need to clean up the @items somehow, as they remain in memory (Rule [1] and Group [2] inherit from Item). The problem is that when I try @items.map(&:destroy), I run into a segfault. I reached out to the guy who works on openscap and wrote the bindings; I'm sure he'll be able to tell whether there is a problem in the underlying layers or whether I'm just using it wrong.

[1] https://github.com/OpenSCAP/ruby-openscap/blob/master/lib/openscap/xccdf/rule.rb 

[2] https://github.com/OpenSCAP/ruby-openscap/blob/master/lib/openscap/xccdf/group.rb

Comment 10 Pavel Moravec 2017-06-04 15:15:58 UTC
Created attachment 1284748 [details]
diagnostic & cleanup-attempt patch for foreman-proxy

FYI this diagnostic and attempt-to-cleanup-memory patch was provided to the customer.

Tried to clean up @items and also @results - but only those items that are of a given data type (cleaning up all of them really segfaults, even though there is no other type than the ones I clean up - interesting..).
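
A rough illustration of that type-filtered cleanup (an assumption based on the description above and the Rule/Group bindings linked in comment 9, not the exact attached patch):

if @items
  @items.each do |item|
    # destroying every item segfaults, so free only the types known to be present/safe
    item.destroy if item.is_a?(OpenSCAP::Xccdf::Rule) || item.is_a?(OpenSCAP::Xccdf::Group)
  end
  @items = nil
end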

Comment 12 Pavel Moravec 2017-06-18 14:14:18 UTC
Ondra, could you please have a look at the reproducer in #c11?

Comment 13 Ondřej Pražák 2017-06-22 06:42:58 UTC
This is very useful; the memory usage on the proxy jumped from 0.3% to 12.8% on the first report.

Comment 14 Ondřej Pražák 2017-06-29 11:57:36 UTC
Created redmine issue http://projects.theforeman.org/issues/20164 from this bug

Comment 15 Satellite Program 2017-06-29 14:11:47 UTC
Upstream bug assigned to oprazak

Comment 16 Satellite Program 2017-06-29 14:11:53 UTC
Upstream bug assigned to oprazak

Comment 17 Satellite Program 2017-07-20 12:11:27 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/20164 has been resolved.

Comment 25 Lukas Zapletal 2017-09-05 09:24:16 UTC
*** Bug 1487257 has been marked as a duplicate of this bug. ***

Comment 26 Lukas Zapletal 2017-09-05 11:43:19 UTC
SATELLITE 6.2.11 HOTFIX INSTRUCTIONS:

On all Satellite Servers and Capsules where rubygem-smart_proxy_openscap is installed, perform an upgrade of the package:

yum upgrade http://people.redhat.com/~lzapleta/hotfix/openscap-memory-1454743/el7/rubygem-smart_proxy_openscap-0.5.3.9-1.HOTFIXRHBZ1454743.el7sat.noarch.rpm

If yum requests an upgrade of the -doc subpackage, or you are using RHEL 6, these packages can be found at: http://people.redhat.com/~lzapleta/hotfix/openscap-memory-1454743/

Restart the foreman-proxy process. From now on, a new process is spawned for every OpenSCAP report request to do the XML parsing, and the memory consumption of the foreman-proxy ruby process is back to normal.
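
The per-request forking described above roughly corresponds to this pattern (a sketch of the idea only, not the actual 0.5.3.9 implementation):

pid = fork do
  # the ARF/XML parsing happens in a short-lived child process, so any memory
  # leaked by libopenscap / libxml2 is reclaimed by the OS when the child exits
  OpenSCAP.oscap_init
  # ... parse the report here ...
  OpenSCAP.oscap_cleanup
end
Process.wait(pid)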

This hotfix may work with any Satellite 6.2 version that has the rubygem-smart_proxy_openscap-0.5.3.8 package, as it upgrades this package to version 0.5.3.9.

Comment 27 maneesh verma 2017-10-04 10:26:25 UTC
Created attachment 1334173 [details]
task-exports from failed hotfix system

Comment 29 Sanket Jagtap 2017-10-27 07:42:36 UTC
Satellite 6.3.0 snap 21

USER        PID %CPU %MEM    VSZ   RSS TTY	STAT START   TIME COMMAND
foreman+  91962  0.0  0.2 895584 85652 ?        Sl   Oct22   0:18 ruby /usr/share/foreman-proxy/bin/smart-proxy


USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
foreman+  91962  0.0  0.2 895584 85652 ?        Sl   Oct22   0:18 ruby /usr/share/foreman-proxy/bin/smart-proxy
[root@dell-per630-fc-01 ~]# ps -p 91962 -u
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
foreman+  91962  0.0  0.2 895584 85652 ?        Sl   Oct22   0:18 ruby /usr/share/foreman-proxy/bin/smart-proxy
[root@dell-per630-fc-01 ~]# ps -p 91962 -u
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
foreman+  91962  0.0  0.2 895584 85652 ?        Sl   Oct22   0:18 ruby /usr/share/foreman-proxy/bin/smart-proxy
[root@dell-per630-fc-01 ~]# ps -p 91962 -u
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
foreman+  91962  0.0  0.2 895584 85652 ?        Sl   Oct22   0:18 ruby /usr/share/foreman-proxy/bin/smart-proxy
[root@dell-per630-fc-01 ~]# ps -p 91962 -u
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
foreman+  91962  0.0  0.2 895584 85652 ?        Sl   Oct22   0:18 ruby /usr/share/foreman-proxy/bin/smart-proxy



A SCAP report with the "othered" counter > 0 is sent to the Satellite every minute.

There is no significant growth in the RSS usage.

Marking it verified

Comment 31 Lukas Zapletal 2018-01-11 12:41:13 UTC
I can confirm that the hotfix from comment 26 will work for Satellite 6.2.11 through 6.2.13 inclusive. Just make sure the package named:

rubygem-smart_proxy_openscap-0.5.3.9-1.HOTFIXRHBZ1454743.el7sat.noarch.rpm

is not changed; reinstall it with yum if a different version is present.

Comment 36 Sanket Jagtap 2018-01-15 13:55:09 UTC
Build: Satellite 6.2.14 snap1

Running Satellite for 3 days with two clients sending reports every minute, with approximately >11k reports:
[root@ibm-x3650m4-02-vm-01 ~]#  ps -q 31504 -ux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
foreman+ 31504  0.1  0.1 679764 64692 ?        Sl   05:16   0:17 ruby /usr/share/foreman-proxy/bin/smart-proxy

I monitored the usage for ~1 hr, and there is no major increase in RSS memory consumption.
The two clients are sending a report every minute to the Satellite.

Comment 39 errata-xmlrpc 2018-02-05 13:54:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0273

