Created attachment 1480086 [details]
A test program to reproduce the issue
Description of problem:
Using dlopen() and/or dlsym() in the main thread of a program linked with -lpthread leads to 32 bytes that are never released, as reported by valgrind:
==783108== HEAP SUMMARY:
==783108== in use at exit: 32 bytes in 1 blocks
==783108== total heap usage: 1 allocs, 0 frees, 32 bytes allocated
==783108== 32 bytes in 1 blocks are still reachable in loss record 1 of 1
==783108== at 0x4C30B06: calloc (vg_replace_malloc.c:711)
==783108== by 0x4E3C7D4: _dlerror_run (dlerror.c:140)
==783108== by 0x4E3C095: dlopen@@GLIBC_2.2.5 (dlopen.c:87)
==783108== by 0x400585: test (test.c:13)
==783108== by 0x4005A8: main (test.c:44)
==783108== LEAK SUMMARY:
==783108== definitely lost: 0 bytes in 0 blocks
==783108== indirectly lost: 0 bytes in 0 blocks
==783108== possibly lost: 0 bytes in 0 blocks
==783108== still reachable: 32 bytes in 1 blocks
==783108== suppressed: 0 bytes in 0 blocks
If dlopen() and/or dlsym() is called from another thread, there's no "leaked"* memory reported by valgrind.
When not linking against -lpthread, calling dlopen() and/or dlsym() doesn't "leak"* memory.
* I don't like calling this a leak, as it's one and only one block of 32 bytes which is still reachable, but some people disagree and want *0* bytes reported by valgrind.
BTW, this issue is well known:
For me, it's an issue because I had to explain that my code doesn't leak resources; glibc was "leaking" the memory on my behalf.
It's a rather strange issue, because it happens only when the program is linked with -lpthread, but not when dlopen() and/or dlsym() is actually called in a thread. If a subthread can release the memory, why can't the main thread do the same?
Also note that CentOS 7.5 exhibits the same issue.
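For readers without access to the attachment, the two cases described above can be sketched roughly as follows. This is not the attached test program: the function names and the build line in the comment are assumptions made for illustration, and the valgrind behaviour noted in the comments is simply what this report describes for glibc versions before 2.28.

```c
/* Hypothetical sketch, not the attached test program.
 * Assumed build line once a main() is added:
 *   gcc -g sketch.c -o sketch -ldl -lpthread */
#include <dlfcn.h>
#include <pthread.h>
#include <stddef.h>

/* Open and close a handle to the main program; returns 0 on success. */
static int open_self(void)
{
    void *handle = dlopen(NULL, RTLD_NOW);
    if (handle == NULL)
        return -1;
    return dlclose(handle) == 0 ? 0 : -1;
}

/* Thread entry point: run open_self() and store the result through arg. */
static void *open_self_thread(void *arg)
{
    *(int *)arg = open_self();
    return NULL;
}

/* Case 1: dlopen() from the main thread. Per this report, valgrind shows
 * 32 bytes still reachable when the program is linked with -lpthread. */
int dlopen_in_main_thread(void)
{
    return open_self();
}

/* Case 2: dlopen() from a subthread. Per this report, valgrind shows
 * no block still in use at exit. */
int dlopen_in_subthread(void)
{
    pthread_t tid;
    int result = -1;

    if (pthread_create(&tid, NULL, open_self_thread, &result) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return result;
}
```

Calling one of these functions from main() and running the binary under valgrind --leak-check=full --show-leak-kinds=all should be enough to compare the two cases.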
Created attachment 1480087 [details]
Associated makefile
(In reply to Yann Droneaud from comment #1)
> Created attachment 1480087 [details]
> Associated makefile
Thanks for the report!
As coincidence would have it we ran into this issue earlier this year when working on some malloc reporting.
This is fixed by the fix for bug 23329 which was in glibc 2.28, in Fedora 29 and Rawhide (F30).
I have no plans to backport this to Fedora 28 because it is only a theoretical leak.
Author: Carlos O'Donell <email@example.com>
Date: Fri Jun 22 09:28:47 2018 -0400
libc: Extend __libc_freeres framework (Bug 23329).
The __libc_freeres framework does not extend to non-libc.so objects.
This causes problems in general for valgrind and mtrace detecting
unfreed objects in both libdl.so and libpthread.so. This change is
a pre-requisite to properly moving the malloc hooks out of malloc
since such a move now requires precise accounting of all allocated
data before destructors are run.
This commit adds a proper hook in libc.so.6 for both libdl.so and
for libpthread.so, this ensures that shm-directory.c which uses
freeit () to free memory is called properly. We also remove the
nptl_freeres hook and fall back to using weak-ref-and-check idiom
for a loaded libpthread.so, thus making this process similar for
Lastly we follow best practice and use explicit free calls for
both libdl.so and libpthread.so instead of the generic hook process
which has undefined order.
Tested on x86_64 with no regressions.
Signed-off-by: DJ Delorie <firstname.lastname@example.org>
Signed-off-by: Carlos O'Donell <email@example.com>
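The "weak-ref-and-check idiom" mentioned in the commit message can be illustrated with a minimal sketch. The hook name optional_lib_freeres is hypothetical and is not the symbol glibc uses; the sketch only shows the general pattern of declaring a function weak and calling it only if some loaded object actually defines it. It assumes GCC or Clang on an ELF platform.

```c
#include <stddef.h>

/* Hypothetical cleanup hook that an optional library might define.
 * Because the declaration is weak, the link succeeds even when no object
 * provides the symbol; in that case its address evaluates to NULL. */
extern void optional_lib_freeres(void) __attribute__((weak));

/* Call the hook only when a definition is present.
 * Returns 1 if the hook was called, 0 if it was absent. */
int run_optional_freeres(void)
{
    if (optional_lib_freeres != NULL) {
        optional_lib_freeres();
        return 1;
    }
    return 0;
}
```

In a program where nothing defines optional_lib_freeres, run_optional_freeres() returns 0 and does nothing else; when a library providing the symbol is linked in, the same code calls it without any registration step.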
Valgrind shows __libc_freeres can free the resources:
valgrind --leak-check=full --show-leak-kinds=all ./test-pthread-nothread-dlopen
==7834== Memcheck, a memory error detector
==7834== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==7834== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==7834== Command: ./test-pthread-nothread-dlopen
==7834== HEAP SUMMARY:
==7834== in use at exit: 0 bytes in 0 blocks
==7834== total heap usage: 1 allocs, 1 frees, 32 bytes allocated
==7834== All heap blocks were freed -- no leaks are possible
==7834== For counts of detected and suppressed errors, rerun with: -v
==7834== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I'm marking this CLOSED/RAWHIDE for now.
(In reply to Carlos O'Donell from comment #2)
> (In reply to Yann Droneaud from comment #1)
> As coincidence would have it we ran into this issue earlier this year when
> working on some malloc reporting.
> This is fixed by the fix for bug 23329 which was in glibc 2.28, in Fedora 29
> and Rawhide (F30).
Great, thanks!
(It would be great if the next RHEL/CentOS major version had the fix too.)