Since March 18, test_ctypes on the Python buildbot "aarch64 Fedora Rawhide 3.9" has been failing randomly. The test leaves a core dump behind when it runs the "ldconfig -p" command. That day, dnf upgraded glibc to glibc-2.35.9000-11.fc37.aarch64.

Running "ldconfig -p" in a loop eventually crashes. Example with the shell command:

$ while true; do LC_ALL=C LANG=C /sbin/ldconfig -p > /dev/null; rc=$?; echo "$(date): $rc"; if [ $rc -ne 0 ]; then break; fi; done

Output (French locale; "Erreur de segmentation" means "Segmentation fault"):
---
lun. 21 mars 2022 05:18:50 CET: 0
lun. 21 mars 2022 05:18:50 CET: 0
lun. 21 mars 2022 05:18:50 CET: 0
lun. 21 mars 2022 05:18:50 CET: 0
(...)
lun. 21 mars 2022 05:18:51 CET: 0
lun. 21 mars 2022 05:18:51 CET: 0
lun. 21 mars 2022 05:18:51 CET: 0
lun. 21 mars 2022 05:18:51 CET: 0
Erreur de segmentation (core dumped)
lun. 21 mars 2022 05:18:51 CET: 139
---

gdb backtrace:
---
$ gdb /sbin/ldconfig -c .2896221
Core was generated by `/sbin/ldconfig -p'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000aaaadfed99cc in __brk (addr=<optimized out>) at ../sysdeps/unix/sysv/linux/brk.c:39
39 __set_errno (ENOMEM);
(gdb) where
#0 0x0000aaaadfed99cc in __brk (addr=<optimized out>) at ../sysdeps/unix/sysv/linux/brk.c:39
#1 0x0000aaaadfed9a5c in __sbrk (increment=2968) at sbrk.c:74
#2 0x0000aaaadfeb069c in __libc_setup_tls () at ../csu/libc-tls.c:151
#3 0x0000aaaadfeb02f4 in __libc_start_main_impl (main=0xaaaadfea9cb4 <_start+52>, argc=2, argv=0xfffff6badbb8, init=<optimized out>, fini=<optimized out>, rtld_fini=0x0, stack_end=<optimized out>) at ../csu/libc-start.c:304
#4 0x0000aaaadfea9cb0 in _start () at ../sysdeps/aarch64/start.S:81
---

errno is a TLS variable, but glibc failed to allocate memory for the TLS variable, so it attempts to write to NULL, if I understand correctly.
---
(gdb) disassemble
Dump of assembler code for function __brk:
   0x0000aaaadfed9990 <+0>:  bti c
   0x0000aaaadfed9994 <+4>:  mov x1, x0
   0x0000aaaadfed9998 <+8>:  mov x8, #0xd6 // #214
   0x0000aaaadfed999c <+12>: svc #0x0
   0x0000aaaadfed99a0 <+16>: adrp x2, 0xaaaadff87000 <__pthread_keys+16016>
   0x0000aaaadfed99a4 <+20>: str x0, [x2, #488]
   0x0000aaaadfed99a8 <+24>: cmp x0, x1
   0x0000aaaadfed99ac <+28>: b.cc 0xaaaadfed99b8 <__brk+40> // b.lo, b.ul, b.last
   0x0000aaaadfed99b0 <+32>: mov w0, #0x0 // #0
   0x0000aaaadfed99b4 <+36>: ret
   0x0000aaaadfed99b8 <+40>: adrp x1, 0xaaaadff7f000 <tunable_list+1400>
   0x0000aaaadfed99bc <+44>: ldr x1, [x1, #3512]
   0x0000aaaadfed99c0 <+48>: mrs x2, tpidr_el0
   0x0000aaaadfed99c4 <+52>: mov w3, #0xc // #12
   0x0000aaaadfed99c8 <+56>: mov w0, #0xffffffff // #-1
=> 0x0000aaaadfed99cc <+60>: str w3, [x2, x1]
   0x0000aaaadfed99d0 <+64>: ret
End of assembler dump.
(gdb) frame
#0 0x0000aaaadfed99cc in __brk (addr=<optimized out>) at ../sysdeps/unix/sysv/linux/brk.c:39
39 __set_errno (ENOMEM);
(gdb) l
34 __brk (void *addr)
35 {
36   __curbrk = (void *) INTERNAL_SYSCALL_CALL (brk, addr);
37   if (__curbrk < addr)
38     {
39       __set_errno (ENOMEM);
40       return -1;
41     }
42
43   return 0;
(gdb) p $x1
$1 = 64
(gdb) p $x2
$2 = 0
---

Example of strace output when the bug triggers:
---
$ cat trace
execve("/sbin/ldconfig", ["/sbin/ldconfig", "-p"], 0xfffff4e155a8 /* 34 vars */) = 0
geteuid() = 1002
getuid() = 1002
getegid() = 1002
getgid() = 1002
brk(NULL) = 0xaaaac182d000
brk(0xaaaac182db98) = 0xaaaac182d000
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x40} ---
+++ killed by SIGSEGV (core dumped) +++
---

The last brk() result 0xaaaac182d000 is smaller than the brk() argument 0xaaaac182db98, so the glibc __brk() considers that the memory allocation failed.
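The numbers in the gdb and strace output can be cross-checked with a bit of shell arithmetic. This is only a sketch, assuming a shell with 64-bit arithmetic (bash, for example); the variable names are illustrative and do not come from glibc or the kernel.

```shell
# Values copied from the strace and gdb output in this report.
cur_brk=0xaaaac182d000      # result of brk(NULL) and of the failed brk() call
wanted_brk=0xaaaac182db98   # argument of the second brk() call
tls_offset=64               # (gdb) p $x1
thread_ptr=0                # (gdb) p $x2, read from tpidr_el0

# Bytes requested through sbrk(): should match increment=2968 in frame #1.
increment=$(( $wanted_brk - $cur_brk ))
echo "sbrk increment: $increment"

# Address written by "str w3, [x2, x1]": should match si_addr=0x40.
fault_addr=$(( $thread_ptr + $tls_offset ))
printf 'fault address: 0x%x\n' "$fault_addr"
```

Both values agree with the traces: the failing request is the 2968-byte TLS allocation, and the faulting store goes to offset 0x40 from a thread pointer that is still 0.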
Versions:
---
$ rpm -q glibc
glibc-2.35.9000-11.fc37.aarch64
$ uname -a
Linux python-builder-fedora-rawhide-aarch64 5.17.0-0.rc8.123.fc37.aarch64 #1 SMP Mon Mar 14 17:54:40 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
---
Issue tracked in Python as: https://bugs.python.org/issue47078
Oh, the kernel was updated the same day. Maybe it's a kernel regression:

* old kernel (ok): 5.17.0-0.rc0.20220112gitdaadb3bd0e8d.63.fc36.aarc
* new kernel (bug): 5.17.0-0.rc8.123.fc37.aarch64
This bug reminds me of an old kernel brk issue on AArch64: https://bugzilla.redhat.com/show_bug.cgi?id=1797052
(In reply to Victor Stinner from comment #2)
> Oh, the kernel was updated the same day. Maybe it's a kernel regression:
>
> * old kernel (ok): 5.17.0-0.rc0.20220112gitdaadb3bd0e8d.63.fc36.aarc
> * new kernel (bug): 5.17.0-0.rc8.123.fc37.aarch64

This looks like a duplicate of kernel bug 1749633 for aarch64, but perhaps for static PIE binaries only. ASLR setup is tricky, the kernel gets this wrong from time to time.
With ASLR enabled (/proc/sys/kernel/randomize_va_space = 2), "ldconfig -p" crashes after 300 to 1,300 runs.

With ASLR disabled (/proc/sys/kernel/randomize_va_space = 0), I failed to reproduce the "ldconfig -p" crash: I stopped my test after 20,000 iterations.
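A bounded version of the loop makes such run counts easier to compare between settings. This is only a sketch: run_until_fail is a hypothetical helper name, not something used in the original test, and the limit of 20000 in the usage comment simply mirrors the iteration count mentioned above.

```shell
# run_until_fail MAX CMD [ARGS...]
# Run CMD up to MAX times with its output discarded. On the first failing
# run, print the run number and exit status and return 1. If no run fails,
# report that and return 0.
run_until_fail() {
    _max=$1
    shift
    _i=0
    while [ "$_i" -lt "$_max" ]; do
        _i=$((_i + 1))
        if "$@" > /dev/null 2>&1; then
            _rc=0
        else
            _rc=$?
        fi
        if [ "$_rc" -ne 0 ]; then
            echo "run $_i failed with status $_rc"
            return 1
        fi
    done
    echo "no failure after $_max runs"
    return 0
}

# Usage on the affected machine (not executed here):
#   run_until_fail 20000 /sbin/ldconfig -p
```

A process killed by SIGSEGV is reported by the shell with status 139 (128 + 11), which is the value seen in the loop output earlier in this report.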
Created attachment 1867144
empty.c reproducer

Reproducer: download the attached empty.c and run:

$ gcc -std=c11 -static-pie -g empty.c -o empty -O2
$ i=0; while true; do ./empty; rc=$?; i=$(($i + 1)); echo "$i: $(date): $rc"; if [ $rc -ne 0 ]; then break; fi; done
(...)
1: lun. 21 mars 2022 13:48:09 CET: 0
2: lun. 21 mars 2022 13:48:09 CET: 0
3: lun. 21 mars 2022 13:48:09 CET: 0
4: lun. 21 mars 2022 13:48:09 CET: 0
5: lun. 21 mars 2022 13:48:09 CET: 0
6: lun. 21 mars 2022 13:48:09 CET: 0
Erreur de segmentation (core dumped)
7: lun. 21 mars 2022 13:48:09 CET: 139
Sadly, the final Linux 5.17 release is also affected.

$ uname -r
5.17.0-128.fc37.aarch64
$ i=0; while true; do ./empty; rc=$?; i=$(($i + 1)); echo "$i: $(date): $rc"; if [ $rc -ne 0 ]; then break; fi; done
(...)
252: lun. 21 mars 2022 16:55:11 CET: 0
253: lun. 21 mars 2022 16:55:11 CET: 0
254: lun. 21 mars 2022 16:55:11 CET: 0
Erreur de segmentation (core dumped)
255: lun. 21 mars 2022 16:55:11 CET: 139
I tested different kernel versions to bisect the issue. The regression was introduced between builds 63 (2022-01-12 git daadb3bd0e8d) and 83 (5.17rc2):

* ok: 5.17.0-0.rc0.20220112gitdaadb3bd0e8d.63.fc36.aarch64 (last built kernel without the bug)
* BUG: 5.17.0-0.rc2.83.fc36.aarch64 (first built kernel with the bug)
* BUG: 5.17.0-0.rc2.20220202git9f7fb8de5d9b.84.fc36.aarch64

Sadly, all builds between build 63 and build 83 failed to build, so I cannot narrow the range further with packaged kernels.

Just to be sure, I also tested the kernel 5.16.0-60.fc36.aarch64: ok.
According to git bisect, the bug was introduced by this change:
https://github.com/torvalds/linux/commit/9630f0d60fec5fbcaa4435a66f75df1dc9704b66

commit 9630f0d60fec5fbcaa4435a66f75df1dc9704b66
Author: H.J. Lu <hjl.tools>
Date:   Wed Jan 19 18:09:40 2022 -0800

    fs/binfmt_elf: use PT_LOAD p_align values for static PIE

    Extend commit ce81bb256a22 ("fs/binfmt_elf: use PT_LOAD p_align values for suitable start address") which fixed PIE binaries built with -Wl,-z,max-page-size=0x200000, to cover static PIE binaries. This fixes:

      https://bugzilla.kernel.org/show_bug.cgi?id=215275

    Tested by verifying static PIE binaries with -Wl,-z,max-page-size=0x200000 loading.

    Link: https://lkml.kernel.org/r/20211209174052.370537-1-hjl.tools@gmail.com
    Signed-off-by: H.J. Lu <hjl.tools>
    Cc: Chris Kennelly <ckennelly>
    Cc: Al Viro <viro.org.uk>
    Cc: Alexey Dobriyan <adobriyan>
    Cc: Song Liu <songliubraving>
    Cc: David Rientjes <rientjes>
    Cc: Ian Rogers <irogers>
    Cc: Hugh Dickins <hughd>
    Cc: Suren Baghdasaryan <surenb>
    Cc: Sandeep Patil <sspatil>
    Cc: Fangrui Song <maskray>
    Cc: Nick Desaulniers <ndesaulniers>
    Cc: Kirill A. Shutemov <kirill.shutemov.com>
    Cc: Mike Kravetz <mike.kravetz>
    Cc: Shuah Khan <shuah>
    Signed-off-by: Andrew Morton <akpm>
    Signed-off-by: Linus Torvalds <torvalds>

 fs/binfmt_elf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
This change was followed by the fix below, but that fix does not affect static-pie programs (these programs have interpreter=NULL):
https://github.com/torvalds/linux/commit/925346c129da1171222a9cdb11fa2b734d9955da

commit 925346c129da1171222a9cdb11fa2b734d9955da
Author: Mike Rapoport <rppt>
Date:   Fri Feb 11 16:32:22 2022 -0800

    fs/binfmt_elf: fix PT_LOAD p_align values for loaders

    Rui Salvaterra reported that Aisleroit solitaire crashes with "Wrong __data_start/_end pair" assertion from libgc after update to v5.17-rc1.

    Bisection pointed to commit 9630f0d60fec ("fs/binfmt_elf: use PT_LOAD p_align values for static PIE") that fixed handling of static PIEs, but made the condition that guards load_bias calculation to exclude loader binaries.

    Restoring the check for presence of interpreter fixes the problem.

    Link: https://lkml.kernel.org/r/20220202121433.3697146-1-rppt@kernel.org
    Fixes: 9630f0d60fec ("fs/binfmt_elf: use PT_LOAD p_align values for static PIE")
    Signed-off-by: Mike Rapoport <rppt.com>
    Reported-by: Rui Salvaterra <rsalvaterra>
    Tested-by: Rui Salvaterra <rsalvaterra>
    Cc: Alexander Viro <viro.org.uk>
    Cc: Eric Biederman <ebiederm>
    Cc: "H.J. Lu" <hjl.tools>
    Cc: Kees Cook <keescook>
    Signed-off-by: Andrew Morton <akpm>
    Signed-off-by: Linus Torvalds <torvalds>

 fs/binfmt_elf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
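In the brk() syscall, the following check is the one that rejects the request when the bug occurs (the condition is true, so the kernel jumps to "out" without moving the break):
---
	/* Check against existing mmap mappings. */
	next = find_vma(mm, oldbrk);
	if (next && newbrk + PAGE_SIZE > vm_start_gap(next))
		goto out;
---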
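To see why this check can reject even a small request, here is a simplified model of it in shell arithmetic. It is a sketch, not kernel code: brk_blocked is a hypothetical helper, the page size is assumed to be 4096 bytes, newbrk is assumed to be already page-aligned (as the kernel does before this check), and vm_start_gap() is approximated by the plain start address of the next mapping, while the real function can also subtract a stack guard gap. The next-mapping addresses in the example are made up; only the brk values come from the strace output in the description.

```shell
# brk_blocked NEWBRK NEXT_START [PAGE_SIZE]
# Return 0 (true) when the simplified check would reject the request, that
# is when NEWBRK + PAGE_SIZE > NEXT_START. PAGE_SIZE defaults to 4096,
# which is an assumption of this sketch.
brk_blocked() {
    _new=$1
    _next=$2
    _page=${3:-4096}
    [ "$(( ($_new + $_page) > $_next ))" -eq 1 ]
}

# From the strace output: the break would move from 0xaaaac182d000 to
# 0xaaaac182db98, which rounds up to the next page boundary.
newbrk=0xaaaac182e000

# Hypothetical next mapping starting right at the new break.
if brk_blocked "$newbrk" 0xaaaac182e000; then echo "rejected"; else echo "allowed"; fi

# Hypothetical next mapping 1 MiB higher.
if brk_blocked "$newbrk" 0xaaaac192e000; then echo "rejected"; else echo "allowed"; fi
```

In this model the request is rejected whenever another mapping starts within one page above the new break, which is consistent with the crash depending on where ASLR happens to place the mappings.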
I reported the issue to the kernel upstream bug tracker: https://bugzilla.kernel.org/show_bug.cgi?id=215720
Apparently the revert made it into v5.18-rc3:

commit 354e923df042a11d1ab8ca06b3ebfab3a018a4ec
Author: Andrew Morton <akpm>
Date:   Thu Apr 14 19:13:55 2022 -0700

    revert "fs/binfmt_elf: fix PT_LOAD p_align values for loaders"

    Commit 925346c129da11 ("fs/binfmt_elf: fix PT_LOAD p_align values for loaders") was an attempt to fix regressions due to 9630f0d60fec5f ("fs/binfmt_elf: use PT_LOAD p_align values for static PIE").

commit aeb7923733d100b86c6bc68e7ae32913b0cec9d8
Author: Andrew Morton <akpm>
Date:   Thu Apr 14 19:13:58 2022 -0700

    revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE"

It was Cc:ed to <stable.org>, so hopefully it will make it into a 5.17.z kernel, too.
It has been in since 5.17.4, which was the first 5.17 rebase to stable Fedora releases.
I confirm that the bug is fixed in the Fedora package kernel-5.18.0-0.rc4.20220427git46cf2c613f4b10e.35.fc37.aarch64.

I tested the following shell command:
---
i=0; while true; do ldconfig -V >/dev/null; rc=$?; i=$(($i + 1)); echo "$i: $(date): $rc"; if [ $rc -ne 0 ]; then break; fi; done
---

* With the old kernel 5.17.0-128.fc37.aarch64: the command crashes in fewer than 1,000 iterations
* With the new kernel 5.18.0-0.rc4.20220427git46cf2c613f4b10e.35.fc37.aarch64: there is no crash after 14,000 iterations (I stopped the test)

Since the change introducing the regression has been reverted, can this issue be closed? Or do you want to keep it open until the upstream issue is closed?
https://bugzilla.kernel.org/show_bug.cgi?id=215720