Bug 1708047 - glusterfsd memory leak after enable tls/ssl
Summary: glusterfsd memory leak after enable tls/ssl
Keywords:
Status: CLOSED DUPLICATE of bug 1768407
Alias: None
Product: GlusterFS
Classification: Community
Component: rpc
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1707227 1768339
Blocks:
Reported: 2019-05-09 04:13 UTC by Raghavendra G
Modified: 2020-01-27 11:39 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1707227
Environment:
Last Closed: 2020-01-27 10:52:38 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Links
System ID: Gluster.org Gerrit 22687
Priority: None
Status: Abandoned
Summary: rpc/socket: After enabling TLS, glusterfsd memory leak found
Last Updated: 2019-06-10 09:15:31 UTC

Description Raghavendra G 2019-05-09 04:13:53 UTC
+++ This bug was initially created as a clone of Bug #1707227 +++

Description of problem:

glusterfsd memory leak found
Version-Release number of selected component (if applicable):

3.12.15
How reproducible:
Run the following loop:
while true;do gluster v heal <vol-name> info;done
In another session, check the memory usage of the glusterfsd process for <vol-name>. The memory keeps increasing until it reaches around 370M, then stops growing.

Steps to Reproduce:
1. Run: while true;do gluster v heal <vol-name> info;done
2. Check the memory usage of the glusterfsd process for <vol-name>.
3.
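
The memory check in the steps above can be scripted. The following is only a minimal sketch, assuming a Linux host where /proc/<pid>/status exposes a VmRSS line; the function names parse_rss_kb and sample_rss are hypothetical helpers, not part of GlusterFS.

```shell
#!/bin/sh
# Hypothetical helper: extract the resident set size (in kB) from
# /proc/<pid>/status text supplied on stdin.
parse_rss_kb() {
    awk '/^VmRSS:/ { print $2 }'
}

# Hypothetical helper: print "<epoch-seconds> <rss_kb>" for PID $1
# every $2 seconds, $3 times, stopping early if the process exits.
sample_rss() {
    pid=$1; interval=$2; count=$3
    i=0
    while [ "$i" -lt "$count" ] && [ -r "/proc/$pid/status" ]; do
        printf '%s %s\n' "$(date +%s)" "$(parse_rss_kb < "/proc/$pid/status")"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

While the heal-info loop runs in one session, calling sample_rss with the brick PID (as listed in the Pid column of gluster volume status <vol-name>), an interval, and a sample count in a second session should show whether the resident size keeps growing.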

Actual results:
Memory usage keeps increasing until it reaches around 370M, then stops growing.

Expected results:
Memory usage remains stable.

Additional info:
With valgrind and libleak attached to the glusterfsd process, ssl_accept looks suspicious. It is not clear whether the leak is caused by ssl_accept itself or by GlusterFS misusing SSL:
==16673== 198,720 bytes in 12 blocks are definitely lost in loss record 1,114 of 1,123
==16673== at 0x4C2EB7B: malloc (vg_replace_malloc.c:299)
==16673== by 0x63E1977: CRYPTO_malloc (in /usr/lib64/libcrypto.so.1.0.2p)
==16673== by 0xA855E0C: ssl3_setup_write_buffer (in /usr/lib64/libssl.so.1.0.2p)
==16673== by 0xA855E77: ssl3_setup_buffers (in /usr/lib64/libssl.so.1.0.2p)
==16673== by 0xA8485D9: ssl3_accept (in /usr/lib64/libssl.so.1.0.2p)
==16673== by 0xA610DDF: ssl_complete_connection (socket.c:400)
==16673== by 0xA617F38: ssl_handle_server_connection_attempt (socket.c:2409)
==16673== by 0xA618420: socket_complete_connection (socket.c:2554)
==16673== by 0xA618788: socket_event_handler (socket.c:2613)
==16673== by 0x4ED6983: event_dispatch_epoll_handler (event-epoll.c:587)
==16673== by 0x4ED6C5A: event_dispatch_epoll_worker (event-epoll.c:663)
==16673== by 0x615C5D9: start_thread (in /usr/lib64/libpthread-2.27.so)
==16673==
==16673== 200,544 bytes in 12 blocks are definitely lost in loss record 1,115 of 1,123
==16673== at 0x4C2EB7B: malloc (vg_replace_malloc.c:299)
==16673== by 0x63E1977: CRYPTO_malloc (in /usr/lib64/libcrypto.so.1.0.2p)
==16673== by 0xA855D12: ssl3_setup_read_buffer (in /usr/lib64/libssl.so.1.0.2p)
==16673== by 0xA855E68: ssl3_setup_buffers (in /usr/lib64/libssl.so.1.0.2p)
==16673== by 0xA8485D9: ssl3_accept (in /usr/lib64/libssl.so.1.0.2p)
==16673== by 0xA610DDF: ssl_complete_connection (socket.c:400)
==16673== by 0xA617F38: ssl_handle_server_connection_attempt (socket.c:2409)
==16673== by 0xA618420: socket_complete_connection (socket.c:2554)
==16673== by 0xA618788: socket_event_handler (socket.c:2613)
==16673== by 0x4ED6983: event_dispatch_epoll_handler (event-epoll.c:587)
==16673== by 0x4ED6C5A: event_dispatch_epoll_worker (event-epoll.c:663)
==16673== by 0x615C5D9: start_thread (in /usr/lib64/libpthread-2.27.so)
==16673==
valgrind --leak-check=f
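
To see how much the "definitely lost" records add up to, the byte counts in a valgrind log can be totalled. This is a rough sketch that assumes the record format shown above (byte count in the second field, with thousands separators); definitely_lost_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: sum the byte counts of all "definitely lost"
# records in a valgrind log read from stdin.
definitely_lost_bytes() {
    awk '/are definitely lost in loss record/ {
        bytes = $2
        gsub(/,/, "", bytes)   # "198,720" -> "198720"
        total += bytes
    } END { print total + 0 }'
}
```

Fed the two records quoted above (198,720 and 200,544 bytes), it prints 399264. A saved log, for example a hypothetical valgrind.log, can be passed with definitely_lost_bytes < valgrind.log.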

Also, with another memory leak detection tool, libleak:
callstack[2419] expires. count=1 size=224/224 alloc=362 free=350
/home/robot/libleak/libleak.so(malloc+0x25) [0x7f1460604065]
/lib64/libcrypto.so.10(CRYPTO_malloc+0x58) [0x7f145ecd9978]
/lib64/libcrypto.so.10(EVP_DigestInit_ex+0x2a9) [0x7f145ed95749]
/lib64/libssl.so.10(ssl3_digest_cached_records+0x11d) [0x7f145abb6ced]
/lib64/libssl.so.10(ssl3_accept+0xc8f) [0x7f145abadc4f]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(ssl_complete_connection+0x5e) [0x7f145ae00f3a]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc16d) [0x7f145ae0816d]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc68a) [0x7f145ae0868a]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc9f2) [0x7f145ae089f2]
/lib64/libglusterfs.so.0(+0x9b96f) [0x7f146038596f]
/lib64/libglusterfs.so.0(+0x9bc46) [0x7f1460385c46]
/lib64/libpthread.so.0(+0x75da) [0x7f145f0d15da]
/lib64/libc.so.6(clone+0x3f) [0x7f145e9a7eaf]
callstack[2432] expires. count=1 size=104/104 alloc=362 free=0
/home/robot/libleak/libleak.so(malloc+0x25) [0x7f1460604065]
/lib64/libcrypto.so.10(CRYPTO_malloc+0x58) [0x7f145ecd9978]
/lib64/libcrypto.so.10(BN_MONT_CTX_new+0x17) [0x7f145ed48627]
/lib64/libcrypto.so.10(BN_MONT_CTX_set_locked+0x6d) [0x7f145ed489fd]
/lib64/libcrypto.so.10(+0xff4d9) [0x7f145ed6a4d9]
/lib64/libcrypto.so.10(int_rsa_verify+0x1cd) [0x7f145ed6d41d]
/lib64/libcrypto.so.10(RSA_verify+0x32) [0x7f145ed6d972]
/lib64/libcrypto.so.10(+0x107ff5) [0x7f145ed72ff5]
/lib64/libcrypto.so.10(EVP_VerifyFinal+0x211) [0x7f145ed9dd51]
/lib64/libssl.so.10(ssl3_get_cert_verify+0x5bb) [0x7f145abac06b]
/lib64/libssl.so.10(ssl3_accept+0x988) [0x7f145abad948]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(ssl_complete_connection+0x5e) [0x7f145ae00f3a]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc16d) [0x7f145ae0816d]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc68a) [0x7f145ae0868a]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc9f2) [0x7f145ae089f2]
/lib64/libglusterfs.so.0(+0x9b96f) [0x7f146038596f]
/lib64/libglusterfs.so.0(+0x9bc46) [0x7f1460385c46]
/lib64/libpthread.so.0(+0x75da) [0x7f145f0d15da]
/lib64/libc.so.6(clone+0x3f) [0x7f145e9a7eaf]
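
The libleak records above can be reduced to one number per call stack. The sketch below assumes that alloc= and free= on each "callstack[...] expires" line are the counts of allocations and frees seen for that call stack, so their difference is the number of blocks still outstanding; libleak_outstanding is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: for each libleak "callstack[N] expires" line on
# stdin, print the callstack tag and alloc minus free.
libleak_outstanding() {
    awk '/^callstack\[/ {
        alloc = free = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^alloc=/) alloc = substr($i, 7)
            if ($i ~ /^free=/)  free  = substr($i, 6)
        }
        print $1, alloc - free
    }'
}
```

Under that assumption, callstack[2419] has 12 outstanding blocks (362 allocated, 350 freed) and callstack[2432] has 362 (none freed), which would make the BN_MONT_CTX_new path the more consistent candidate of the two.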

--- Additional comment from zhou lin on 2019-05-08 07:49:03 UTC ---

Thanks for your response!
The glusterfsd process does call the SSL_free interface; however, the SSL context is shared between many SSL objects. Do you think keeping the shared SSL context could cause this memory leak?

--- Additional comment from Worker Ant on 2019-05-09 02:59:43 UTC ---

REVIEW: https://review.gluster.org/22687 (After enabling TLS, glusterfsd memory leak found) posted (#1) for review on master by None

Comment 1 Worker Ant 2019-05-09 04:18:54 UTC
REVIEW: https://review.gluster.org/22687 (rpc/socket: After enabling TLS, glusterfsd memory leak found) posted (#2) for review on master by Raghavendra G

Comment 2 Xavi Hernandez 2020-01-27 10:52:38 UTC
This is already solved in bug #1768407

*** This bug has been marked as a duplicate of bug 1768407 ***

