On Fedora 33 with the system jemalloc, a simple C program just deadlocks:

(gdb) bt
#0  __lll_lock_wait (futex=0x7ffff76032a8, private=0) at lowlevellock.c:52
#1  0x00007ffff7967763 in __GI___pthread_mutex_lock (mutex=0x7ffff76032a8) at ../nptl/pthread_mutex_lock.c:80
#2  0x00007ffff7ba0475 in je_malloc_mutex_lock_slow () from /lib64/libjemalloc.so.2
#3  0x00007ffff7bbeef8 in extent_recycle.isra () from /lib64/libjemalloc.so.2
#4  0x00007ffff7b6c4d7 in arena_bin_malloc_hard.lto_priv () from /lib64/libjemalloc.so.2
#5  0x00007ffff7bc65ed in je_arena_tcache_fill_small.constprop () from /lib64/libjemalloc.so.2
#6  0x00007ffff7b5d558 in je_malloc_default () from /lib64/libjemalloc.so.2
#7  0x00007ffff79fa8e4 in __GI__IO_file_doallocate (fp=0x7ffff7b4b520 <_IO_2_1_stdout_>) at filedoalloc.c:101
#8  0x00007ffff7a092a0 in __GI__IO_doallocbuf (fp=0x7ffff7b4b520 <_IO_2_1_stdout_>) at libioP.h:948
#9  __GI__IO_doallocbuf (fp=0x7ffff7b4b520 <_IO_2_1_stdout_>) at genops.c:342
#10 0x00007ffff7a08438 in _IO_new_file_overflow (f=0x7ffff7b4b520 <_IO_2_1_stdout_>, ch=-1) at fileops.c:745
#11 0x00007ffff7a074e6 in _IO_new_file_xsputn (n=4, data=<optimized out>, f=<optimized out>) at libioP.h:948
#12 _IO_new_file_xsputn (f=0x7ffff7b4b520 <_IO_2_1_stdout_>, data=<optimized out>, n=4) at fileops.c:1197
#13 0x00007ffff79f2219 in outstring_func (done=0, length=<optimized out>, string=<optimized out>, s=0x7ffff7b4b520 <_IO_2_1_stdout_>) at ../libio/libioP.h:948
#14 __vfprintf_internal (s=0x7ffff7b4b520 <_IO_2_1_stdout_>, format=0x402010 "%ld\n", ap=0x7fffffffdcc0, mode_flags=0) at vfprintf-internal.c:1646
#15 0x00007ffff79de4af in __printf (format=<optimized out>) at printf.c:33
#16 0x0000000000401156 in main ()

C sources:

#include <unistd.h>
#include <stdio.h>

int
main (void)
{
  printf ("%ld\n", sysconf (_SC_PAGESIZE));
}

Debugging this is difficult because of the lack of audit namespace support in GDB.
--- Additional comment from Florian Weimer on 2020-09-15 13:34:02 UTC ---

Sorry, forgot to mention that jemalloc is linked with -ljemalloc (no LD_PRELOAD).

--- Additional comment from Siddhesh Poyarekar on 2020-10-28 02:12:43 UTC ---

The full command with upstream glibc to reproduce the deadlock in comment 3:

env LD_AUDIT=./libaudit.so \
  GLIBC_TUNABLES=glibc.rtld.optional_static_tls=5120 \
  $builddir/elf/ld.so \
  --library-path $builddir:$builddir/elf:$builddir/nptl \
  ./jemalloc

--- Additional comment from Siddhesh Poyarekar on 2020-12-11 14:19:39 UTC ---

I spent some time debugging this today and the root cause is that jemalloc, when linked in directly, comes into use before pthreads are initialized. libc.so has symbols for pthread_mutex_lock and pthread_mutex_unlock to take care of that, wherein those operations become nops until the pthreads subsystem is initialized.

The twist in the plot is that jemalloc uses *pthread_mutex_trylock*, which does not have a forwarder in libc.so and actually sets the lock primitive. Its paired pthread_mutex_unlock is still a nop because it's still too early, thus setting the stage for the deadlock we see.

It's straightforward to fix this by adding a forwarder for pthread_mutex_trylock in libc.so, but on discussion with Florian, we agreed to move all of the mutex functions into libc.so instead, since that's something we want to do anyway.
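For readers trying to follow the root cause above, here is a toy model of the sequence in C. It is only a sketch under stated assumptions, not glibc or jemalloc code: every name in it (toy_mutex, toy_lock, toy_unlock, toy_trylock, toy_replay_early_startup, threads_initialized) is hypothetical, and the "would block" result stands in for the real pthread_mutex_lock waiting forever. It assumes the lock and unlock forwarders do nothing before thread initialization, while trylock always operates on the real lock word unless it is given a forwarder too.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a lock word, plus a flag recording whether the
   threading subsystem has finished initializing. */
struct toy_mutex { int locked; };

static bool threads_initialized = false;

/* Forwarded lock: a no-op before initialization. Returns 0 on success,
   -1 where a real mutex would block. */
static int toy_lock(struct toy_mutex *m) {
    if (!threads_initialized)
        return 0;
    if (m->locked)
        return -1;              /* the real call would wait here forever */
    m->locked = 1;
    return 0;
}

/* Forwarded unlock: a no-op before initialization, so the lock word
   is left untouched. */
static int toy_unlock(struct toy_mutex *m) {
    if (!threads_initialized)
        return 0;
    m->locked = 0;
    return 0;
}

/* trylock: without a forwarder it always touches the real lock word.
   With a (hypothetical) forwarder it is a no-op before initialization. */
static int toy_trylock(struct toy_mutex *m, bool has_forwarder) {
    if (has_forwarder && !threads_initialized)
        return 0;
    if (m->locked)
        return -1;
    m->locked = 1;
    return 0;
}

/* Replays the sequence from the comment above. Returns 1 if the lock
   taken after initialization would block (the deadlock), else 0. */
int toy_replay_early_startup(bool trylock_has_forwarder) {
    struct toy_mutex m = { 0 };

    threads_initialized = false;
    int rc = toy_trylock(&m, trylock_has_forwarder); /* early allocation */
    assert(rc == 0);            /* a fresh mutex is always acquired */
    toy_unlock(&m);             /* too early: does nothing */

    threads_initialized = true;
    return toy_lock(&m) != 0;   /* e.g. the later malloc from printf */
}
```

In this model, toy_replay_early_startup(false) returns 1 because the early trylock leaves the lock word set and the paired unlock never clears it. toy_replay_early_startup(true) returns 0, which corresponds to the forwarder-based fix mentioned above; the approach actually chosen, moving the mutex functions into libc.so, removes the forwarders and with them the mismatch.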
This message is a reminder that Fedora 32 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 32 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
This should be fixed in rawhide. I need to verify it and close this as done.
Fix has been pushed to rawhide.