Bug 2072962 - redhat openssl leaks memory with perl-Net-SSLeay and radiator
Keywords:
Status: CLOSED DUPLICATE of bug 2097690
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: openssl-pkcs11
Version: 8.5
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Clemens Lang
QA Contact: BaseOS QE Security Team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-04-07 10:56 UTC by Wolfgang Breyha
Modified: 2022-09-22 10:34 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-22 10:33:57 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
radiator test configuration (677 bytes, text/plain)
2022-04-07 10:56 UTC, Wolfgang Breyha


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker CRYPTO-7233 0 None None None 2022-05-16 13:35:33 UTC
Red Hat Issue Tracker RHELPLAN-118231 0 None None None 2022-04-07 11:02:53 UTC

Description Wolfgang Breyha 2022-04-07 10:56:27 UTC
Created attachment 1871262
radiator test configuration

Description of problem:
We recently replaced our el6 RADIUS machines with el8 ones. Shortly afterwards we noticed that radiator slowly leaks memory and, depending on the request rate, reaches the machine's memory limit within days.

After some debugging we contacted the radiator developers on their mailing list and concluded that the patches Red Hat applies to openssl 1.1.1 appear to be the cause of the leak, since a custom build of openssl from clean 1.1.1n source does not leak memory in the same setting.

Version-Release number of selected component (if applicable):
openssl-1.1.1k-5 and 6 and openssl-1.1.1n-1 (from f35 rebuilt for el8)

How reproducible:
A basic RHEL 8.5 machine with perl-Net-SSLeay and radiator[0] installed.


To trigger the memory leak we used eapol_test from the wpa_supplicant RPM.

I will give a "short" description of all the steps here as well. The full mailing list thread is accessible at [1].

Steps to Reproduce:
Put the attached leak_test.cfg into /root/.
Create the directory /root/leak_test.
Create the text file /root/leak_test/users containing "testuser\n".
# cp /opt/radiator/radiator/certificates/cert-srv.pem /root/leak_test/
# cp -a /opt/radiator/radiator/certificates/demoCA /root/leak_test/
Save a private key without a passphrase:
# openssl rsa -in /root/leak_test/cert-srv.pem -out /root/leak_test/key.pem
The passphrase can be found in /opt/radiator/radiator/certificates/README-demo-CA.txt:78

edit /root/leak_test/eap_peap.testuser.conf to contain:
# cat eap_peap.testuser.conf 
network={
eap=PEAP
eapol_flags=0
key_mgmt=IEEE8021X
identity="testuser"
anonymous_identity="anonymous"
password="xxxxxxx"
ca_cert="/root/leak_test/demoCA/cacert.pem"
phase2="auth=MSCHAPV2"
}

start radiusd using
# perl /opt/radiator/radiator/radiusd -foreground -no_pid_file -config_file /root/leak_test.cfg

Run eapol_test in a loop to watch radiusd grow:
# for i in {1..1000}; do eapol_test -c leak_test/eap_peap.testuser.conf -s localtest -a 127.0.0.1 -N25:s:WLAN >/dev/null 2>&1; done

Because of perl's memory management and the very small leak (256 bytes to 1 KB per handshake), it takes some time for the growth to become visible.
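
The slow growth described above can be watched by sampling the resident set size of radiusd between batches of handshakes. This is only a minimal sketch, assuming a Linux /proc filesystem; the helper names rss_kb and watch_rss are hypothetical and are not part of radiator or wpa_supplicant:

```sh
# Print the resident set size (VmRSS, in kB) of a process, read from /proc.
# Linux-only; prints nothing and returns non-zero if the PID does not exist.
rss_kb() {
    awk '/^VmRSS:/ { print $2; found = 1 } END { exit !found }' \
        "/proc/$1/status" 2>/dev/null
}

# Run a command in batches and print "<runs so far><TAB><RSS in kB>" after
# each batch. Usage: watch_rss PID BATCHES PER_BATCH COMMAND [ARGS...]
watch_rss() {
    pid=$1; batches=$2; per_batch=$3; shift 3
    b=1
    while [ "$b" -le "$batches" ]; do
        i=0
        while [ "$i" -lt "$per_batch" ]; do
            # Output and failures of the command are deliberately ignored.
            "$@" >/dev/null 2>&1 || true
            i=$((i + 1))
        done
        printf '%s\t%s\n' "$((b * per_batch))" "$(rss_kb "$pid")"
        b=$((b + 1))
    done
}
```

With radiusd running in another terminal, it could be called from /root as: watch_rss "$(pgrep -o -f radiusd)" 10 100 eapol_test -c leak_test/eap_peap.testuser.conf -s localtest -a 127.0.0.1 -N25:s:WLAN. RSS is only a coarse indicator, so a steady upward trend over many batches is more meaningful than any single sample.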

We also used valgrind and memleax to monitor radiusd. Compared to an openssl without the Red Hat patches, there are reports within SSL_ functions like:
==1420461== 233,728 bytes in 913 blocks are definitely lost in loss record 6,062 of 6,088
==1420461==    at 0x4C360A5: malloc (vg_replace_malloc.c:380)
==1420461==    by 0xA39690C: CRYPTO_zalloc (in /usr/lib64/libcrypto.so.1.1.1k)
==1420461==    by 0xA382AC3: EVP_PKEY_meth_new (in /usr/lib64/libcrypto.so.1.1.1k)
==1420461==    by 0xCF3CAD7: ??? (in /usr/lib64/engines-1.1/pkcs11.so)
==1420461==    by 0xA3648E4: ENGINE_get_pkey_meth (in /usr/lib64/libcrypto.so.1.1.1k)
==1420461==    by 0xA382EA4: ??? (in /usr/lib64/libcrypto.so.1.1.1k)
==1420461==    by 0xA37E543: ??? (in /usr/lib64/libcrypto.so.1.1.1k)
==1420461==    by 0x9FD5A41: ??? (in /usr/lib64/libssl.so.1.1.1k)
==1420461==    by 0x9FC833E: ??? (in /usr/lib64/libssl.so.1.1.1k)
==1420461==    by 0x9FB3C97: SSL_do_handshake (in /usr/lib64/libssl.so.1.1.1k)
==1420461==    by 0x9D3ACA3: ??? (in /usr/lib64/perl5/vendor_perl/auto/Net/SSLeay/SSLeay.so)
==1420461==    by 0x4F2F4B8: Perl_pp_entersub (in /usr/lib64/libperl.so.5.26.3)

and a "possibly lost memory"
==1420461== 640,000 bytes in 1,000 blocks are possibly lost in loss record 6,079 of 6,088
==1420461==    at 0x4C360A5: malloc (vg_replace_malloc.c:380)
==1420461==    by 0xA39690C: CRYPTO_zalloc (in /usr/lib64/libcrypto.so.1.1.1k)
==1420461==    by 0x9FBA5AC: SSL_SESSION_new (in /usr/lib64/libssl.so.1.1.1k)
==1420461==    by 0x9FBAE06: ??? (in /usr/lib64/libssl.so.1.1.1k)
==1420461==    by 0x9FD9E78: ??? (in /usr/lib64/libssl.so.1.1.1k)
==1420461==    by 0x9FC855A: ??? (in /usr/lib64/libssl.so.1.1.1k)
==1420461==    by 0x9FB3C97: SSL_do_handshake (in /usr/lib64/libssl.so.1.1.1k)
==1420461==    by 0x9D3ACA3: ??? (in /usr/lib64/perl5/vendor_perl/auto/Net/SSLeay/SSLeay.so)
==1420461==    by 0x4F2F4B8: Perl_pp_entersub (in /usr/lib64/libperl.so.5.26.3)
==1420461==    by 0x4F27324: Perl_runops_standard (in /usr/lib64/libperl.so.5.26.3)
==1420461==    by 0x4EA6FFC: perl_run (in /usr/lib64/libperl.so.5.26.3)
==1420461==    by 0x108ED9: ??? (in /usr/bin/perl)

or from memleax:
CallStack[24]: may-leak=6 (1536 bytes)
    expired=6 (1536 bytes), free_expired=0 (0 bytes)
    alloc=33 (8448 bytes), free=0 (0 bytes)
    freed memory live time: min=0 max=0 average=0
    un-freed memory live time: max=11
    0x00007fbbc4301a90  libc-2.28.so  __malloc()+0
    0x00007fbbc211b90d  libcrypto.so  CRYPTO_zalloc()+13
    0x00007fbbc2107ac4  libcrypto.so  EVP_PKEY_meth_new()+36
    0x00007fbbc154ead8  pkcs11.so
    0x00007fbbc20e98e5  libcrypto.so  ENGINE_get_pkey_meth()+53
    0x00007fbbc2107ea5  libcrypto.so
    0x00007fbbc2103544  libcrypto.so
    0x00007fbbc24d7a42  libssl.so
    0x00007fbbc24ca33f  libssl.so
    0x00007fbbc24b5c98  libssl.so  SSL_do_handshake()+88
    0x00007fbbc2767e84  SSLeay.so
    0x00007fbbc55144b9  libperl.so  Perl_pp_entersub()+505
    0x00007fbbc550c325  libperl.so  Perl_runops_standard()+53
    0x00007fbbc548bffd  libperl.so  perl_run()+797
    0x0000562b6f5f7eda  perl
    0x00007fbbc429f493  libc-2.28.so  __libc_start_main()+243
    0x0000562b6f5f7f1e  perl
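
The per-allocation size behind these reports can be read off the valgrind record headers by dividing bytes by blocks. The following is a rough sketch rather than a general valgrind parser: the function name leak_sizes is hypothetical, and it only handles the simple "N bytes in M blocks are definitely/possibly lost" header form shown above.

```sh
# Read valgrind output on stdin and print, for each "definitely lost" or
# "possibly lost" record header: kind, total bytes, blocks, bytes per block.
leak_sizes() {
    awk '
        / bytes in .* blocks are (definitely|possibly) lost/ {
            for (i = 2; i <= NF; i++) {
                if ($i == "bytes")  { bytes  = $(i - 1) }
                if ($i == "blocks") { blocks = $(i - 1) }
                if ($i == "lost")   { kind   = $(i - 1) }
            }
            # Strip the thousands separators valgrind prints.
            gsub(/,/, "", bytes); gsub(/,/, "", blocks)
            printf "%s %d %d %d\n", kind, bytes, blocks, bytes / blocks
        }
    '
}
```

Fed the two record headers above, it prints "definitely 233728 913 256" and "possibly 640000 1000 640", that is, 256 bytes per EVP_PKEY_meth_new allocation and 640 bytes per SSL_SESSION_new allocation. The memleax figures are consistent with the first value: 8448 bytes over 33 allocations is also 256 bytes each.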



Actual results:
radiusd leaks 256 bytes to 1 KB of memory per TLS handshake

Expected results:
No memory leak

Additional info:
[0] https://www.open.com.au/radiator/
[1] https://lists.open.com.au/pipermail/radiator/2022-April/021869.html

Comment 1 Dmitry Belyavskiy 2022-05-03 12:01:13 UTC
The 1st leak looks like a bug in the pkcs11 engine: the EVP_PKEY_METHOD should be freed when the engine is unloaded. Is that missing?

The 2nd looks like a missing SSL_SESSION_free at the application/module level.

Comment 2 Dmitry Belyavskiy 2022-05-03 12:22:43 UTC
No, all engine-registered methods are to be freed by openssl itself. Could you please check that the engine is properly shut down?

Comment 3 Dmitry Belyavskiy 2022-05-03 13:02:07 UTC
Could you please explain how you deal with the engine? I have the impression that you initialize and free it many times.

Comment 4 Wolfgang Breyha 2022-05-03 23:28:38 UTC
I asked a maintainer of radiator to contribute since I can't answer your questions about it without assumptions. I only operate the radius servers and did the leak hunting and reported the results so far.

What I know is that radiator is a single-threaded perl daemon. So I assume that it initializes and frees every TLS session it handles, and that it does not kill any children after a certain number of requests like forking daemons do. I don't know whether or how the engine itself is handled differently.

Our servers have been running OpenSSL 1.1.1n built from unpatched original source for a month now, and I can confirm that they do not leak at all, just as on el6 before.

According to information from the radiator mailing list, el9 beta is not affected either. But I have not verified that, since I have no el9 hosts here.

Comment 5 Jakub Jelen 2022-05-04 15:19:55 UTC
Can you install debuginfo for openssl and openssl-pkcs11 to get useful backtraces?

Comment 6 Wolfgang Breyha 2022-05-16 13:04:46 UTC
BTDT...

valgrind:
==476564== 240,128 bytes in 938 blocks are definitely lost in loss record 5,978 of 6,001
==476564==    at 0x4C37135: malloc (vg_replace_malloc.c:381)
==476564==    by 0xA39290C: CRYPTO_zalloc (mem.c:230)
==476564==    by 0xA37EAC3: EVP_PKEY_meth_new (pmeth_lib.c:187)
==476564==    by 0x40A4BD7: UnknownInlinedFun (p11_pkey.c:508)
==476564==    by 0x40A4BD7: PKCS11_pkey_meths (p11_pkey.c:682)
==476564==    by 0xA3608E4: ENGINE_get_pkey_meth (tb_pkmeth.c:74)
==476564==    by 0xA37EEA4: int_ctx_new (pmeth_lib.c:136)
==476564==    by 0xA37A543: do_sigver_init (m_sigver.c:29)
==476564==    by 0x9FD1A41: tls_construct_server_key_exchange (statem_srvr.c:2804)
==476564==    by 0x9FC433E: write_state_machine (statem.c:843)
==476564==    by 0x9FC433E: state_machine.part.5 (statem.c:443)
==476564==    by 0x9FAFC97: SSL_do_handshake (ssl_lib.c:3681)
==476564==    by 0x9D39893: ??? (in /usr/lib64/perl5/vendor_perl/auto/Net/SSLeay/SSLeay.so)
==476564==    by 0x4F304D8: Perl_pp_entersub (in /usr/lib64/libperl.so.5.26.3)

memleax shows more or less the same, but with less detail:
CallStack[921]: memory expires with 256 bytes, backtrace:
    0x00007ff61649e660  libc-2.28.so  malloc()+0
    0x00007ff6142a090d  libcrypto.so  CRYPTO_zalloc()+13  crypto/mem.c:230
    0x00007ff61428cac4  libcrypto.so  EVP_PKEY_meth_new()+36  crypto/evp/pmeth_lib.c:187
    0x00007ff617b78bd8  ??
    0x00007ff61426e8e5  libcrypto.so  ENGINE_get_pkey_meth()+53  crypto/engine/tb_pkmeth.c:74
    0x00007ff61428cea5  libcrypto.so  ?()  crypto/evp/pmeth_lib.c:136
    0x00007ff614288544  libcrypto.so  ?()  crypto/evp/m_sigver.c:29
    0x00007ff6141a9c52  libcrypto.so  ASN1_item_verify()+674  crypto/asn1/a_verify.c:157
    0x00007ff614319914  libcrypto.so  ?()  crypto/x509/x509_vfy.c:1815
    0x00007ff61431b876  libcrypto.so  ?()  crypto/x509/x509_vfy.c:233
    0x00007ff61431bf10  libcrypto.so  X509_verify_cert()+160  crypto/x509/x509_vfy.c:303
    0x00007ff614659c26  libssl.so  ?()  ssl/statem/statem_lib.c:959
    0x00007ff61465d738  libssl.so  ?()  ssl/statem/statem_srvr.c:3812
    0x00007ff61464f33f  libssl.so  ?()  ssl/statem/statem.c:843
    0x00007ff61463ac98  libssl.so  SSL_do_handshake()+88  ssl/ssl_lib.c:3681
    0x00007ff6148ed894  SSLeay.so
    0x00007ff61769a4d9  libperl.so  Perl_pp_entersub()+505
    0x00007ff617692345  libperl.so  Perl_runops_standard()+53
    0x00007ff61761200d  libperl.so  perl_run()+797
    0x000055d2ab750eda  perl
    0x00007ff61643cca3  libc-2.28.so  __libc_start_main()+243
    0x000055d2ab750f1e  perl

Comment 7 Dmitry Belyavskiy 2022-05-16 13:09:24 UTC
I have the impression that you load the engine once per handshake, which is not the best behavior. You should probably load the engine explicitly once, on initialization.

Comment 8 Jakub Jelen 2022-05-17 12:59:40 UTC
From the setup steps, it is not clear whether you use pkcs11 keys for anything. Does the issue go away just by not loading the pkcs11 engine (uninstalling openssl-pkcs11)?

Comment 9 Wolfgang Breyha 2022-05-18 09:50:46 UTC
I can confirm that removing openssl-pkcs11 silences valgrind and memleax. The reports above do not appear when radiator runs without it.

I changed back to stock openssl and removed pkcs11 on one of our production radius servers to see if its memory consumption is stable as well. Will report later...

Comment 10 Wolfgang Breyha 2022-05-18 14:45:53 UTC
Our production host also runs stable and without a memory leak when the pkcs11 engine is not installed.

Comment 11 Heikki Vatiainen 2022-05-18 17:05:19 UTC
I also did testing with and without the openssl-pkcs11 package. The test is simple: a 'while :' shell loop that calls eapol_test to run PEAP against Radiator. When openssl-pkcs11 is installed, which seems to be the default with RHEL 8 and its derivatives, the Radiator process slowly grows. When I stop Radiator and do this:

% sudo rpm -e openssl-pkcs11

and then re-run Radiator and the eapol_test loop, the Radiator process size appears to be stable. In short: removing the pkcs11 engine helps.

Radiator attempts to load the pkcs11 engine once per process with the OpenSSL ENGINE API. Typically this engine is not needed because the private key is in a regular file. Radiator could make pkcs11 engine loading configurable for those who need it.

I checked a couple of other OSes and noticed that RHEL 7 and Ubuntu 20.04 and 22.04, for example, do not come with the pkcs11 engine installed by default. Package libengine-pkcs11-openssl on Ubuntu 20.04 is version 0.4.10-1, which is close to RHEL 8, which has 0.4.10-2. When I repeated the test on Ubuntu 20.04 with the pkcs11 engine installed, the slow leak did not appear.

It seems the pkcs11 engine has had memory leaks, for example https://github.com/OpenSC/libp11/issues/358 , and it might be that they are still present in the RHEL 8 version.
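
Whether a running process has actually loaded the engine can be checked from its memory map, independently of whether the package is installed. The following is a small sketch under the assumption of a Linux /proc filesystem; maps_has_lib is a hypothetical helper name, and the match is on the file name at the end of each mapping line only.

```sh
# Succeed if the given maps file (for example /proc/PID/maps) contains a
# mapping whose path ends in "/<library file name>".
maps_has_lib() {
    awk -v lib="/$2" '
        {
            n = length($NF); m = length(lib)
            if (n >= m && substr($NF, n - m + 1) == lib) { found = 1 }
        }
        END { exit !found }
    ' "$1"
}
```

For example, maps_has_lib "/proc/$(pgrep -o -f radiusd)/maps" pkcs11.so && echo "pkcs11 engine is mapped" reports whether radiusd currently has the engine mapped, while rpm -q openssl-pkcs11 shows whether the package is installed at all.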

Comment 12 Clemens Lang 2022-06-27 13:41:40 UTC
Same issue as bug 2097690.

Comment 13 Simo Sorce 2022-09-21 20:05:11 UTC
Clemens, if this is the same issue as the other bug, please close one of the two as a duplicate.

Comment 14 Clemens Lang 2022-09-22 10:33:57 UTC
I'm closing this as a duplicate of 2097690, since that has more extensive debugging and reproduction information in private comments.
Please monitor bug 2097690, and thank you very much for the report and analysis.

*** This bug has been marked as a duplicate of bug 2097690 ***

