With gcc-7.2.1-4.fc28.armv7hl, compiling glibc with -fstack-clash-protection succeeds, but the valgrind check for the newly built glibc fails:

+ elf/ld.so --library-path .:elf:nptl:dlfcn /usr/bin/valgrind --error-exitcode=1 elf/ld.so --library-path .:elf:nptl:dlfcn /usr/bin/true
==23302== Memcheck, a memory error detector
==23302== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==23302== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==23302== Command: elf/ld.so --library-path .:elf:nptl:dlfcn /usr/bin/true
==23302==
==23302== Invalid write of size 4
==23302==    at 0x10945C: _dl_start (rtld.c:442)
==23302==    by 0x108C8F: ??? (in /builddir/build/BUILD/glibc-2.26.9000-936-g37ac8e635a/build-armv7hl-redhat-linuxeabi/elf/ld.so)
==23302==  Address 0xbd8418f4 is on thread 1's stack
==23302==  72 bytes below stack pointer
==23302==
==23302== Invalid write of size 4
==23302==    at 0x113C64: _dl_setup_hash (dl-lookup.c:936)
==23302==    by 0x109717: _dl_start_final (rtld.c:391)
==23302==    by 0x109717: _dl_start (rtld.c:519)
==23302==    by 0x108C8F: ??? (in /builddir/build/BUILD/glibc-2.26.9000-936-g37ac8e635a/build-armv7hl-redhat-linuxeabi/elf/ld.so)
==23302==  Address 0xbd841908 is on thread 1's stack
==23302==  8 bytes below stack pointer

It even runs into a segmentation fault when running under valgrind:

==23302== Invalid write of size 4
==23302==    at 0x120D94: free (dl-minimal.c:109)
==23302==    by 0x110557: fillin_rpath (dl-load.c:526)
==23302==    by 0x110C53: _dl_init_paths (dl-load.c:815)
==23302==    by 0x10B913: dl_main (rtld.c:1317)
==23302==    by 0x1205D7: _dl_sysdep_start (dl-sysdep.c:253)
==23302==    by 0x10975B: _dl_start_final (rtld.c:412)
==23302==    by 0x10975B: _dl_start (rtld.c:519)
==23302==    by 0x108C8F: ??? (in /builddir/build/BUILD/glibc-2.26.9000-936-g37ac8e635a/build-armv7hl-redhat-linuxeabi/elf/ld.so)
==23302==  Address 0xbd841658 is on thread 1's stack
==23302==  16 bytes below stack pointer
==23302==
==23302== Invalid write of size 4
==23302==    at 0x109D9C: handle_ld_preload (rtld.c:835)
==23302==    by 0x10BB27: dl_main (rtld.c:1612)
==23302==    by 0x1205D7: _dl_sysdep_start (dl-sysdep.c:253)
==23302==    by 0x10975B: _dl_start_final (rtld.c:412)
==23302==    by 0x10975B: _dl_start (rtld.c:519)
==23302==    by 0x108C8F: ??? (in /builddir/build/BUILD/glibc-2.26.9000-936-g37ac8e635a/build-armv7hl-redhat-linuxeabi/elf/ld.so)
==23302==  Address 0xbd8406ec is not stack'd, malloc'd or (recently) free'd
==23302==
==23302==
==23302== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==23302==  Access not within mapped region at address 0xBD8406EC
==23302==    at 0x109D9C: handle_ld_preload (rtld.c:835)
==23302==    by 0x10BB27: dl_main (rtld.c:1612)
==23302==    by 0x1205D7: _dl_sysdep_start (dl-sysdep.c:253)
==23302==    by 0x10975B: _dl_start_final (rtld.c:412)
==23302==    by 0x10975B: _dl_start (rtld.c:519)
==23302==    by 0x108C8F: ??? (in /builddir/build/BUILD/glibc-2.26.9000-936-g37ac8e635a/build-armv7hl-redhat-linuxeabi/elf/ld.so)
==23302==  If you believe this happened as a result of a stack
==23302==  overflow in your program's main thread (unlikely but
==23302==  possible), you can try to increase the size of the
==23302==  main thread stack using the --main-stacksize= flag.
==23302==  The main thread stack size used in this run was 8388608.

I haven't verified that this is actually caused by -fstack-clash-protection, but I suspect it is.
Here's Jeff Law's analysis:

A reminder, we never did a stack clash specific prologue implementation for 32bit ARM. Instead we rely on the older -fstack-check bits that were done for Ada eons ago. Those bits give a degree of protection, but were never (to my knowledge) vetted to work with valgrind.

If we look at arm_emit_probe_stack_range it's pretty obvious what's happening.

  /* See if we have a constant small number of probes to generate.  If so,
     that's the easy case.  */
  if (size <= PROBE_INTERVAL)
    {
      emit_move_insn (reg1, GEN_INT (first + PROBE_INTERVAL));
      emit_set_insn (reg1, gen_rtx_MINUS (Pmode, stack_pointer_rtx, reg1));
      emit_stack_probe (plus_constant (Pmode, reg1, PROBE_INTERVAL - size));
    }

ie:

  r1 = PROBE_INTERVAL
  r1 = sp - reg1
  *r1 = 0;

That's going to do a write out of the stack bounds every time. It's one of the fundamental problems with the -fstack-check support for 32bit ARM.

So to reiterate, this is precisely the kind of problem we avoid by having stack-clash specific prologues on the Red Hat Enterprise Linux architectures. We didn't do a 32bit ARM implementation and instead rely on the limited protections provided by the Ada -fstack-check bits.
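To make the address arithmetic in the analysis concrete, here is a minimal sketch in plain C that models only the three RTL-emitting calls in the quoted branch. It is not GCC code: the helper names small_frame_probe_address and bytes_below_sp are hypothetical, the value 4096 for PROBE_INTERVAL is an assumption (the real value is target-dependent), and the sp/first/size inputs are made-up illustration values. Following the quoted branch literally, the probed address works out to sp - first - size, so if the probe is issued before the stack pointer is lowered (which is what the "N bytes below stack pointer" reports suggest), the write always lands below sp.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: 4096 is used here for illustration only; the real
   PROBE_INTERVAL depends on the target configuration.  */
#define PROBE_INTERVAL 4096u

/* Hypothetical helper: mirrors the arithmetic of the quoted
   "size <= PROBE_INTERVAL" branch on a 32-bit address space.  */
static uint32_t
small_frame_probe_address (uint32_t sp, uint32_t first, uint32_t size)
{
  uint32_t reg1 = first + PROBE_INTERVAL;   /* emit_move_insn */
  reg1 = sp - reg1;                         /* emit_set_insn */
  return reg1 + (PROBE_INTERVAL - size);    /* address given to emit_stack_probe */
}

/* Hypothetical helper: how far below sp the probe write lands.  */
static uint32_t
bytes_below_sp (uint32_t sp, uint32_t first, uint32_t size)
{
  return sp - small_frame_probe_address (sp, first, size);
}

#ifdef PROBE_DEMO_MAIN
int
main (void)
{
  /* Made-up inputs, not taken from the failing build.  */
  uint32_t sp = 0x10000000u, first = 64u, size = 8u;

  /* The probe is always first + size bytes below sp.  */
  assert (bytes_below_sp (sp, first, size) == first + size);
  printf ("probe at 0x%08x, %u bytes below sp\n",
          (unsigned) small_frame_probe_address (sp, first, size),
          (unsigned) bytes_below_sp (sp, first, size));
  return 0;
}
#endif
```

Compiling with -DPROBE_DEMO_MAIN builds the small demo. A write below the current stack pointer is exactly what Memcheck flags as an invalid write, and once the probed address falls outside the mapped stack region it would be expected to fault, as in the final trace.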
Per comment 1, there is simply not enough upstream support to fix this. We will have to build armhfp without -fstack-clash-protection.