Bug 1822246 - JSS - NativeProxy never calls releaseNativeResources - Memory Leak
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: jss
Version: 8.2
Hardware: All
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 8.3
Assignee: Alex Scheel
QA Contact: PKI QE
URL:
Whiteboard:
Depends On:
Blocks: 1822402
Reported: 2020-04-08 14:51 UTC by Alex Scheel
Modified: 2020-11-04 03:16 UTC

Fixed In Version: pki-core-10.6-8030020200806183337.5ff1562f
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1822402
Environment:
Last Closed: 2020-11-04 03:15:11 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System: Github
ID: dogtagpki/pki/tree/master/tests/dogtag/pytest-ansible/pytest/performance_test
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2020-09-24 11:49:21 UTC

Description Alex Scheel 2020-04-08 14:51:15 UTC
Description of problem:

In JSS v4.6.2, shipped in RHEL 8.2, our NativeProxy class was rewritten. Instead of an instance registry backed by a Hashtable keyed on a random id, the rewrite used a HashSet that relied on the default Object.hashCode() implementation.

Because of this hashCode() implementation, registry.contains(NativeProxy inst) always returned false, so releaseNativeResources was never called.

This was detected via QE load testing on RHCS 10.0 GA. It also likely impacts IPA.

When issuing 100k certificates with a fixed JVM heap size of 6GB, memory usage balloons to 15GB.
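
For illustration only, here is a minimal sketch of the general Java pitfall behind the registry lookup failure described above. It is not the JSS code: BrokenProxy, FixedProxy, and the byte-array pointer field are hypothetical names. It assumes a proxy whose equals() compares pointer bytes while hashCode() is inherited from Object. In that case a HashSet will almost never find an equal-but-distinct instance, because the two objects hash to unrelated buckets.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class RegistryLookupSketch {
    /** Hypothetical proxy: equals() compares pointer bytes, hashCode() comes from Object. */
    public static class BrokenProxy {
        final byte[] pointer;

        public BrokenProxy(byte[] pointer) {
            this.pointer = pointer.clone();
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof BrokenProxy
                && Arrays.equals(pointer, ((BrokenProxy) other).pointer);
        }
        // No hashCode() override: equal proxies get unrelated identity hash codes.
    }

    /** Same hypothetical proxy, with hashCode() kept consistent with equals(). */
    public static class FixedProxy {
        final byte[] pointer;

        public FixedProxy(byte[] pointer) {
            this.pointer = pointer.clone();
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof FixedProxy
                && Arrays.equals(pointer, ((FixedProxy) other).pointer);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(pointer);
        }
    }

    /** Looks up an equal-but-distinct BrokenProxy; the result depends on identity hashes. */
    public static boolean brokenLookup() {
        Set<BrokenProxy> registry = new HashSet<>();
        registry.add(new BrokenProxy(new byte[] {1, 2, 3, 4}));
        return registry.contains(new BrokenProxy(new byte[] {1, 2, 3, 4}));
    }

    /** Looks up an equal-but-distinct FixedProxy; equal hash codes make this succeed. */
    public static boolean fixedLookup() {
        Set<FixedProxy> registry = new HashSet<>();
        registry.add(new FixedProxy(new byte[] {1, 2, 3, 4}));
        return registry.contains(new FixedProxy(new byte[] {1, 2, 3, 4}));
    }

    public static void main(String[] args) {
        System.out.println("fixed lookup:  " + fixedLookup());
        System.out.println("broken lookup: " + brokenLookup());
    }
}
```

The broken lookup is not guaranteed to fail on every run, since identity hash codes can collide, but in practice it does. The exact code path that failed inside JSS is not shown in this report.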

Version-Release number of selected component (if applicable):

jss-v4.6.2-4

How reproducible:

Very

Steps to Reproduce:
1. See the QE load test procedure:
   https://polarion.engineering.redhat.com/polarion/#/project/CERT/workitem?id=CERT-15339

Actual results:

Memory leaks excessively since no NativeProxy object is ever freed.

Expected results:

Memory shouldn't leak.

Additional info:

Comment 4 Alex Scheel 2020-06-08 12:20:35 UTC
Checked in upstream: 


commit 0c5f6703ce736782b554665dc6b584313757fb23
Author: Alexander Scheel <ascheel>
Date:   Wed Apr 15 10:59:16 2020 -0400

    Handle NULL pointers in releaseNativeResources
    
    In the style of the previous commit, ensure all pointers are
    non-NULL before continuing to free them. Some of these are excessive as
    NSS does do some checking, but in this case consistency is better.
    
    Signed-off-by: Alexander Scheel <ascheel>


commit 33ae12d7055271b7ff5a95867302f9c6358eeb0a
Author: Alexander Scheel <ascheel>
Date:   Mon Apr 6 16:49:49 2020 -0400

    Fix NativeProxy registry tracking
    
    When the switch was made to a HashSet-based registry in
    eb5df01003d74b57473eacb84e538d31f5bb06ca, NativeProxy didn't override
    hashCode(...). This resulted in calls to close() (and thus, finalize())
    not invoking the releaseNativeResources() function to release the
    underlying memory.
    
    Signed-off-by: Alexander Scheel <ascheel>


Pull request: https://github.com/dogtagpki/jss/pull/473

Comment 12 Alex Scheel 2020-07-29 12:46:00 UTC
This is checked in upstream. Deepak has confirmed via email that the fixes work.

commit f4a874f9355eae1e769f1798f0b9543cba61d449 ( --- causes multigigabyte LOG files --- )
Author: Alexander Scheel <ascheel>
Date:   Tue Jul 28 15:21:21 2020 -0400

    Remove invalid Base64 logging
    
    While a nice idea in theory, this generates a ton of spurious messages
    right now. We should eventually fix this and re-enable logging, but for
    now we'll remove it.
    
    Signed-off-by: Alexander Scheel <ascheel>

commit aad4e90bba0d2f4f596a492074a4209eba1be64b
Author: Alexander Scheel <ascheel>
Date:   Tue Jul 28 11:10:28 2020 -0400

    Switch NativeProxy registry to use a WeakRefMap
    
    One issue with the NativeProxy fix from last release
    (33ae12d7055271b7ff5a95867302f9c6358eeb0a) was that we now always
    stored strong references to tracked pointers rather than weak
    references. The downside of this approach is that every single reference
    must be explicitly closed rather than allowing the GC to close them as
    they go out of scope. By using a weak reference, presence in the
    NativeProxy registry is not sufficient to keep the reference around.
    
    Signed-off-by: Alexander Scheel <ascheel>
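
As a rough sketch of the idea in the commit message above, a registry can track live objects without keeping them alive by holding them only through weak references. The code below is an assumption-laden illustration, not the JSS WeakRefMap: it uses the standard java.util.WeakHashMap, and the names WeakRegistrySketch, TrackedResource, and isTracked are hypothetical.

```java
import java.util.Collections;
import java.util.Set;
import java.util.WeakHashMap;

public class WeakRegistrySketch {
    // Set view over a WeakHashMap: membership alone does not keep an entry reachable.
    private static final Set<TrackedResource> REGISTRY =
        Collections.synchronizedSet(
            Collections.newSetFromMap(new WeakHashMap<TrackedResource, Boolean>()));

    /** Hypothetical stand-in for an object that owns native memory. */
    public static class TrackedResource implements AutoCloseable {
        private boolean released = false;

        public TrackedResource() {
            REGISTRY.add(this);
        }

        @Override
        public synchronized void close() {
            if (released) {
                return; // already released; nothing left to free
            }
            released = true;
            REGISTRY.remove(this);
            // A real implementation would free the native resource here.
        }

        public synchronized boolean isReleased() {
            return released;
        }
    }

    /** Reports whether the registry still tracks the given resource. */
    public static boolean isTracked(TrackedResource resource) {
        return REGISTRY.contains(resource);
    }

    public static void main(String[] args) {
        TrackedResource resource = new TrackedResource();
        System.out.println("tracked before close: " + isTracked(resource));
        resource.close();
        System.out.println("tracked after close:  " + isTracked(resource));
    }
}
```

With a strong-reference set, a resource that is never closed stays in the registry forever. Here, once the last strong reference is dropped, the entry becomes eligible for removal, although when that happens is up to the garbage collector.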

commit ea00750625d042d46c473489275573309b8e4575
Author: Alexander Scheel <ascheel>
Date:   Tue Jul 28 10:51:30 2020 -0400

    Make SSL_ImportFD clear underlying PRFileDesc
    
    When SSL_ImportFD executes successfully, the base PRFileDesc gets
    consumed by NSS. This means the Java NativeProxy wrapper (PRFDProxy) can
    get garbage collected. However, we need to ensure we don't call
    PR.Close() on the underlying socket until we no longer need the
    SSLFDProxy instance as well.
    
    This affected JSSEngine's template as well: the base PRFDProxy
    underlying the SSLFDProxy template should eventually get garbage
    collected and freed, causing the template to no longer be valid. We
    should instead allow the Java object to be GC'd without invoking
    PR.Close().
    
    Signed-off-by: Alexander Scheel <ascheel>

commit ee3dd06c598de1d4021cc5fdd53bf4e986848673
Author: Alexander Scheel <ascheel>
Date:   Fri Jul 24 12:34:38 2020 -0400

    Duplicate client certificate in handler
    
    When JSS passes the certificate for use in the client auth handler, it
    doesn't duplicate this certificate. However, NSS will later attempt to
    free this key. We should duplicate the key before returning it to NSS,
    allowing NSS to free it safely.
    
    Note that, because the key isn't passed in to the client auth handler,
    but instead queried, we need not duplicate it.
    
    Signed-off-by: Alexander Scheel <ascheel>

commit 8f771193aebb38ce465c4b66e3a8454c28b2affb
Author: Alexander Scheel <ascheel>
Date:   Fri Jul 24 12:12:12 2020 -0400

    Prevent usage of ssl_fd after closing
    
    In the previous commit, 76396ae47adf740aac0db38f143d959e5d6c39ec, by
    calling the destructor on BufferPRFD layer, we finally clean it up
    properly. However, this resulted in multiple calls to closeInbound or
    closeOutbound resulting in a use-after-free.
    
    Signed-off-by: Alexander Scheel <ascheel>
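
One common way to guard against the kind of repeated-close problem described in the commit message above is to make the release step idempotent. The sketch below shows that general pattern under stated assumptions and is not the actual JSS change: GuardedHandle and releaseCount are hypothetical, and the counter merely stands in for freeing a native layer.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class CloseOnceSketch {
    /** Hypothetical handle whose underlying resource must be released at most once. */
    public static class GuardedHandle {
        private final AtomicBoolean closed = new AtomicBoolean(false);
        // Counts releases for demonstration; a real handle would free native memory.
        private final AtomicInteger releaseCount = new AtomicInteger();

        public void closeInbound() {
            releaseOnce();
        }

        public void closeOutbound() {
            releaseOnce();
        }

        private void releaseOnce() {
            // compareAndSet lets exactly one caller win, even under concurrent closes.
            if (closed.compareAndSet(false, true)) {
                releaseCount.incrementAndGet();
            }
        }

        public boolean isClosed() {
            return closed.get();
        }

        public int releaseCount() {
            return releaseCount.get();
        }
    }

    public static void main(String[] args) {
        GuardedHandle handle = new GuardedHandle();
        handle.closeInbound();
        handle.closeOutbound();
        handle.closeInbound();
        System.out.println("releases performed: " + handle.releaseCount());
    }
}
```

Any method that touches the underlying handle would also need to check isClosed() first, so that nothing dereferences the resource after it has been freed.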

commit 81a149764ec5d5bfd9e1205c4d349819fe154f5e
Author: Alexander Scheel <ascheel>
Date:   Thu Jul 23 10:33:25 2020 -0400

    Fix memory leaks in TestBufferPRFDSSL
    
    Signed-off-by: Alexander Scheel <ascheel>

commit 76396ae47adf740aac0db38f143d959e5d6c39ec
Author: Alexander Scheel <ascheel>
Date:   Thu Jul 23 10:31:51 2020 -0400

    Fix memory leak during BufferPRFD destruction
    
    BufferPRFDs must destroy their layer when they're the only layer left,
    otherwise closing a layer will leave allocated resources around.
    
    Signed-off-by: Alexander Scheel <ascheel>

commit f426d2e452587649e67afcdf9889b5500d9e3178
Author: Alexander Scheel <ascheel>
Date:   Wed Jul 22 16:01:25 2020 -0400

    Clear PRFDProxy after import, session on close
    
    Signed-off-by: Alexander Scheel <ascheel>

commit b597226f8a243bcd8258bd0f156d8b8a1ad2cb41
Author: Alexander Scheel <ascheel>
Date:   Wed Jul 22 14:01:43 2020 -0400

    Allow sessions to clear PK11Cert instances
    
    Signed-off-by: Alexander Scheel <ascheel>

commit 68ae2b18ee8bbb245b4f64a2c7c36b60de99f53d
Author: Alexander Scheel <ascheel>
Date:   Wed Jul 22 14:01:22 2020 -0400

    Fix TokenProxy leak
    
    Signed-off-by: Alexander Scheel <ascheel>

commit bd0e60f992fa8cc6fe59f0fa414efc095a78b457
Author: Alexander Scheel <ascheel>
Date:   Wed Jul 22 10:56:27 2020 -0400

    Close SSLEngine inbound during socket close
    
    Signed-off-by: Alexander Scheel <ascheel>

commit 6fafc019ddeaba2d1bcbca574053b92550fe7c06
Author: Alexander Scheel <ascheel>
Date:   Wed Jul 22 10:56:08 2020 -0400

    Handle NULL return from SSL_ImportFD
    
    Signed-off-by: Alexander Scheel <ascheel>

Comment 16 errata-xmlrpc 2020-11-04 03:15:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: pki-core:10.6 and pki-deps:10.6 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4847

